Abstract. The paper advocates the need for systems which support
maintenance of LISP-type data bases, and describes an experimental system
of this kind, called DABA. In this system, a description of the data
base's structure is kept in the data base itself. A number of utility
programs use the description for operations on the data base. The
description must minimally include syntactic information reminiscent of
data structure declarations in more conventional programming languages,
and can be extended by the user.

Two reasons for such systems are seen: (1) As A.I. programs develop from
toy domains using toy data bases, to more realistic exercises, the
management of the knowledge base becomes non-trivial and requires program
support. (2) A powerful way to organize LISP programs is to make them
data-driven, whereby pieces of program are distributed throughout a data
base. A data base management system facilitates the use of this
programming style.

The paper describes and discusses the basic ideas in the DABA system as
well as the technique of data-driven programs.
1. Focus on the data base.

In this paper I will attempt to say three things at once. That stylistic experiment is undertaken
not out of choice, but out of necessity: the three topics are intertwined, and none of them can
be discussed without the context of the others.

The first topic regards the attitude to data bases: I shall argue that the current thinking about
data bases in A.I. has missed an important point, which can be tersely characterized as the
separate identity of the data base, independently of the program(s) that use it.

The second topic is a corollary of the first one, namely the design of systems for management of
data bases in the new sense, in the context of a LISP or LISP-like programming system. A
very experimental management system for LISP data bases is described. The system provides
utility operations on the data base, such as data entry (prompt the user for contributions to the
data base), presentation (nice printouts) and backup (dump a part of the data base on a file).
Additional utilities are planned. All utilities use a description of the data base's structure,
which is stored in the data base itself. The structure description must minimally contain
syntactic information similar to what one finds in data-structure declarations in conventional
programming languages. It can however be arbitrarily incremented by the user. Since it is in
the data base, the description must itself have a description, which is also in the data base, and
so on until a description which describes itself.

This system (called DABA) is motivated partly by the practical problem of maintaining
collections of knowledge of non-trivial size, for use in A.I. programs, and partly by my
preference for a certain programming style, which is here called data-driven programming.
Only a throw-away implementation of DABA exists currently: the system is described here in
order to exemplify various desirable properties in systems for data base management, and not as an
available product.

The method of data-driven programming is the third topic of the paper. That programming
technique is frequently used but rarely discussed; the reader who has already used it will
recognize it by a common operation in data-driven programs, namely

    (APPLY (GET ... ) ... )

In other words, data-driven programs are those where large parts of the program are
procedures or program fragments that are stored in the data base, in a less trivial sense than
as EXPR properties. The paper argues for the use of this technique. This is relevant to the
data base topic because program management tools for data-driven programs have the same
requirements as data base management tools. In fact, the distinction between 'program' and
'data base' becomes fuzzy and unimportant.

The remainder of section 1 attempts to spell out my view of data bases, and the idea that utility
programs are an important tool for working with a data base in the new sense. Section 2
describes the basic description mechanism in the DABA system; section 3 discusses data-driven
programming in more detail, and section 4 discusses some simple procedure generation
techniques in data-driven programs.

One of the many definitions of 'data base' in the world of commercial computing is 'a
collection of data which is suitable for use by a variety of different programs'. It is implicit in
the definition that the data base has an existence of its own, and a non-trivial life-length
(although it may develop and change during its existence). The definition implies a need for
separate documentation and separate maintenance of the data base.

This view of the data base is significantly different from what one finds in A.I. In our field,
the 'data base' has usually been an appendix to or a scratchpad area for the program, created
during the computation, and later garbage collected, or discarded at the end of the run. But
the separate-identity view of the data base is appropriate also in the A.I. context, in the
following cases:

- as the user-provided collections of knowledge that programs use. It has been common
practice to use minimal knowledge bases when programs are run (for several reasons
including memory problems), but the time now seems ripe for working with more exhaustive
collections of knowledge. The problems of setting up, debugging, and editing the
knowledge base then become non-trivial.

- as knowledge generated by or reorganized by programs. Learning programs (in the broad
sense of the word) are only useful if the acquired knowledge can be saved for use during
later runs. As another example, programmer's-apprentice-type programs [see e.g. Rich and
Shrobe, 1974] need to analyze the user's input program, and form a model of it. That model
has to be maintained between runs.

- as data-driven programs. Since programs have to be preserved between runs, it only makes
sense to say that a program is a special case of a data base if the data base is so preserved.

Let the two kinds of data base be called a 'scratchpad' data base (temporary data base during
execution of a program) and a 'perennial' data base (has separate identity, separate
documentation, etc., is maintained between runs, and is designed so that it can conveniently be
used by several programs). In fact, the difference is as much in the way of looking at and
working with the data base, as in the design of the data base itself.

The 'perennial' or 'separate-identity' view of a data base is very similar to the ordinary LISP
programmer's attitude to his program. Working with a program does not merely involve
running it, but also various types of service work: one may take out a part of the program
and re-write it; one may take out a piece of another program, adapt it, and insert it in one's
own; one uses pretty-print programs, cross-indexers and other tools, to obtain readable listings
and documentation for careful study of the program, and so forth. The very same operations
on a data base come naturally when it develops to non-trivial size.



The major computational implication of the 'separate-identity' view of the data base is
therefore the usefulness of utility programs, i.e. programs like pretty-printers and
cross-indexers, which serve the user when he works with the data base, and which are usually called
directly by the user, rather than as subroutines. Utility programs for operations on LISP
programs are in common use, and can sometimes be used for data bases as well (such as
pretty-printers). But a number of additional utilities, as well as additional options in existing
utilities, are useful for data base operations. The following are utility operations which I have
often wished I had had when working with LISP-type data bases, and which exist or are planned
in the DABA system:

-- a data entry utility that prompts the user for contributions to the data base. In a simple
case, instead of letting the user type in

    (DEFPROP BOSTON MASS INSTATE)

(in an elementary object-property representation), the system would acquire the information that
BOSTON is a city, and then prompt the appropriate properties by typing out for example

    BOSTON : INSTATE =

whereupon the user can answer

    MASS

The difference in convenience and error rate is of course negligible for the extremely small toy
bases that often have been used in A.I. programs, but significant when one enters more
practical volumes of data. - In practice, a good data entry utility must allow for higher-level
data representations as well, for mixed-initiative dialogue, and for conversational conveniences
such as 'undoing' [Teitelman, 1974].

-- a dumping utility for saving collections of data on files. If we again use an example in the
elementary object-property representation, the filing utility needs a catalogue of carriers (such
as BOSTON above) and information about which properties of this carrier shall be saved, and
it should generate a file which when read will re-create those properties. A basic facility of
this kind exists in INTERLISP [Teitelman, 1974].

-- presentation utilities which print out the data base or parts of it in a nice format, so that the
user can work with it easily. Several presentation methods are possible: an
indentation-oriented layout is reasonable when one prints properties which are sizable expressions, and
when one wants to print properties of properties recursively to some depth. A tabular
layout with several columns is appropriate for atomic properties, and for relation-type data
bases where the data base is a set of tuples. Such presentation utilities are similar to the
dumper, except that they could also make use of information about the intended structure of
properties. For example, if it is known in a separate declaration that the property under a
certain indicator is to be a list which will be used as a set, then an appropriate indentation
strategy could be chosen, and one might sugar the printout with curly brackets. If it is known
that another property is a gensym atom, then one might want to print it in terms of some of its
properties, rather than its printname.

-- a checking utility, to check that all properties in a collection of data satisfy the descriptions
that have been made. One can check against declarations of the intended structure for each
property (atom of certain type, list of atoms, etc.), against redundancy rules (if A = getp[B,I]
then B ∈ getp[A,J]), and so on.

-- a merging utility. Suppose that travel cost between cities has been represented as

    getp[BOSTON, TRAVELCOST] = [NYC [AIR 26.37 BUS 13.75]
                                TORONTO [AIR 109.10 ...] ...]

with the obvious interpretation (Boston - New York $26.37 by air, etc.), and that one wants to
merge two files of data with similar structure. If both files contain properties for the same
carrier/indicator pair such as BOSTON/TRAVELCOST, then one must make the obvious merge of
the two assigned properties, rather than let one overwrite the other. A fairly general utility
program could do that if provided with structure declarations for properties.

-- an excerption utility. The inverse of merging (for obtaining a prescribed subset of the data
base), but needs the same structure information.



-- a utility for shift of representation. Suppose we want to re-represent the travel cost
information above as

    getp[BOSTON, FLIGHTCOST] = [NYC <US$ 26.37> TORONTO <US$ 109.10> ...]
    getp[BOSTON, BUSCOST] = [NYC <US$ 13.75> ...]

either because of a whim when changing our own primary program, or in order to adapt
somebody else's data to our program. Such a shift should again be doable by some utility
provided with descriptions of the old and new structure, and their relation.
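
The merging and shift-of-representation operations can be sketched in a few lines. The following is a minimal Python illustration, not DABA code: property-lists are modelled as nested dictionaries, the function names merge_props and shift_travelcost are invented for the example, the cost figures are placeholders, and the rule that the later file wins for atomic values is an assumed policy.

```python
def merge_props(old, new):
    """Merge two property values; nested property-lists (dicts) are merged
    key by key instead of letting one overwrite the other."""
    if isinstance(old, dict) and isinstance(new, dict):
        merged = dict(old)
        for key, value in new.items():
            merged[key] = merge_props(old[key], value) if key in old else value
        return merged
    return new  # atomic values: the later file wins (an assumed policy)


def shift_travelcost(travelcost, mode):
    """Re-represent {dest: {mode: cost}} as {dest: ("US$", cost)} for one mode."""
    return {dest: ("US$", costs[mode])
            for dest, costs in travelcost.items() if mode in costs}


# Two files with properties for the same carrier/indicator pair.
file1 = {"BOSTON": {"TRAVELCOST": {"NYC": {"AIR": 26.37}}}}
file2 = {"BOSTON": {"TRAVELCOST": {"NYC": {"BUS": 13.75},
                                   "TORONTO": {"AIR": 109.10}}}}

merged = merge_props(file1, file2)
flightcost = shift_travelcost(merged["BOSTON"]["TRAVELCOST"], "AIR")
buscost = shift_travelcost(merged["BOSTON"]["TRAVELCOST"], "BUS")
```

A general utility would consult the structure declaration of each property rather than inspect Python types, which is what lets one program serve many data bases.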

The list can easily be continued. It is trivial to write programs for such operations, for each
application or each data base one has. But it is a bother, and one would prefer to have access
to more general utility programs. More general programs are slightly harder to write, since
one wants them to be usable for various higher-level data representations besides the
elementary object-property representation. Depending on the desired flexibility of the
program, a utility program may range from a hacking exercise to a hard A.I. problem.

When a (general) utility program is used, it must be provided with a parameter-type description
of the data structure that it is to operate on. That description can sometimes be integrated in
the data itself, but often it is desirable to write it separately, like a set of declarations for the
data representation. In the latter case, it is also possible to speed up execution by partially
evaluating the utility program with respect to the parameters, as described in [Beckman et al.,
1974].

If one has to write out those declarations for each utility program, then that also can be a
considerable burden. But it seems that the same declarations or structure descriptions could
serve several utilities. For example, in the elementary representation where properties are
assigned to typed objects, one needs information about

* which properties are carried by each type (used by data entry, dumping and presentation
utilities);

* which structure is expected for the property under a certain indicator (can be used by
almost all utilities, including those for presentation, checking, merging, excerption, and shift
of representation. Also, it would be reasonable to check for appropriate structure during
data entry);

* redundancy rules, for example for property inversion (used by the checker, as discussed
above, and could also be checked or generated at data entry);

* if higher-level data representations are used, such as contexts, property assignments to
non-atomic carriers, or relational storage with pattern-directed retrieval, then all utilities need to
know about the storage conventions for that representation.

Furthermore, such a structure description for the data base is also part of the desired user
documentation for the data base. It is therefore a reasonable goal to have one common
description which can be used by all utilities, and for documentation purposes.
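
To make the point about one description serving several utilities concrete, here is a minimal Python sketch under stated assumptions: the layout of the dictionary named description, the function names entry_prompts and check_inversion, and the prompt format are all invented for illustration and are not taken from DABA.

```python
# Hypothetical shared description of the elementary representation.
description = {
    # which properties are carried by each type
    "carrprops": {"CITY": ["INSTATE", "SUBURBS"], "STATE": ["HASCITIES"]},
    # redundancy rule: INSTATE is inverted by HASCITIES
    "inverse": {"INSTATE": "HASCITIES"},
}


def entry_prompts(carrier, sort, description):
    """Data entry utility: the prompts to issue for a new carrier of a sort."""
    return [f"{carrier} : {ind} =" for ind in description["carrprops"][sort]]


def check_inversion(data, description):
    """Checking utility: list (carrier, indicator, value) triples whose
    inverse property is missing. data maps carrier -> {indicator: value}."""
    errors = []
    for carrier, props in data.items():
        for ind, inv in description["inverse"].items():
            if ind in props:
                target = props[ind]
                if carrier not in data.get(target, {}).get(inv, ()):
                    errors.append((carrier, ind, target))
    return errors


data = {
    "BOSTON": {"INSTATE": "MASS"},
    "NYC": {"INSTATE": "NY"},          # NY has no HASCITIES entry: a violation
    "MASS": {"HASCITIES": {"BOSTON", "LEXINGTON"}},
}
```

Both utilities read the same description and neither mentions a particular indicator by name, so adding a sort or a rule changes only the data.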

All points that have been made so far apply not only to LISP data bases, but also to
conventional, "bulk" data bases, and are in fact well recognized in the latter environment. The
LISP environment does however offer some additional possibilities. Most importantly, the
description of the data base can be stored in the data base itself, and still be used by the
program that operates on the data base. To render this more precise, it is natural to consider
the data base as a collection of data blocks, where the description of a data block is a new data
block which is also in the data base. (The regress terminates if some data block describes
itself). The structure description of a data block will be called its meta-block. Utilities can then
usually be defined as operations on blocks, which use the meta-block of the argument as
parameters.

The idea of data blocks is in fact useful not only for distinguishing data from their
descriptions, but also for modularizing the primary data (data which serve the purpose of the
system, as opposed to descriptions) in the data base. A data block should then be a chunk of
data which have a common structure and/or are closely related, by some criterion. It could
consist of a set of tuples (= relations) which are stored in the data base, or (in the elementary
representation) of a set of property assignments (= triples of carrier, indicator, property).

A word of caution: the term 'block' has some connotations in computing which are not
intended in this context. No recursive nesting of blocks or scope for identifiers is intended. It
is in fact often desirable to distribute the properties of an atom to several blocks. The primary
intended association of the term 'data block' is to the practice of organizing LISP function
definitions into 'blocks' or 'files' of closely related functions.

An experimental system, called DABA, has been instrumental in developing and testing some of
these ideas. DABA is a MACLISP program. The next section describes the data description
and block structure in the DABA system, and also discusses other aspects which would be
desirable in a more developed system. Before proceeding, let us however add some
hand-waving comments about this whole approach.



From the theoretical viewpoint, the central issue in this work is data base description. Utility
programs enter the picture because they probably offer the first practical usage of such
descriptions, but there are also longer-range aspects to data base descriptions in A.I. First,
automatic-programming and programmer's-assistant type systems need not only program
descriptions, but also data base descriptions, whenever the program they generate or support is
to operate on a data base - and for A.I. programs that is usually the case.

Another reason for working at data base descriptions is the need for methods of evaluating,
comparing, and relating the many proposed representations of 'knowledge' data, such as
various 'semantic nets'. Again it is natural to compare with the situation for programs: active
work on the theory of computation for almost a decade has provided a considerable body of
knowledge about equivalence, complexity, and other properties of programs. Similar
knowledge about data structures would be quite useful. Just like for programs, mathematical
logic can be expected to be of some use, but not to take us all the way. A concept of
self-describing data bases might fit well into that picture.

The idea of defining operations (such as utilities) on blocks of data should perhaps be
explored as a programming notation. Matrices (of numbers or other atomic entities) are a
powerful notation in mathematics and in programming languages such as APL [Iverson, 1962].
Also, one of the advantages of relational data bases [Codd, 1970] is that they enable one to use
an algebra on relations (= sets of tuples), which has also been used for some formal query
languages. The same idea might be useful for specifying search and other quasi-parallel
operations in LISP programs.



2. Servicing utility operations.

The DABA system can be used in at least two modes. In the simplest mode, the user has one
program, here called the primary program, which uses the data base. A question-answering
program is a standard example. As the data base attains non-trivial size, the user wants to use
some utility programs on the data base. He therefore has to write down a structure description
of the data base he already has. DABA is a system for representing and maintaining such
descriptions in a systematic way, plus a collection of utility programs which use the
descriptions. In the case discussed here, the primary program and the data base existed before
the DABA facilities were called in. (The other mode of using the system is for managing
data-driven programs, and will be discussed in the next section).

Let us choose a specific example and then describe how its structure would be described to the
DABA system. We must here select a very simple example, which uses an object-property
representation, in order to concentrate on the description. The DABA system is however
useful for data bases with a richer structure as well.

Consider a block of property-list data about cities in the eastern United States. The block is a
set of property assignments, or triples, such as

    {<BOSTON, INSTATE, MASS>,
     <BOSTON, SUBURBS, {LEXINGTON, REVERE, ...}>,
     ...
     <NYC, INSTATE, NY>,
     ...
     <MASS, HASCITIES, {BOSTON, LEXINGTON, ...}>,
     <MASS, FULLNAME, MASSACHUSETTS>,
     ... }

which of course says that Boston is in the state of Massachusetts, and so on. ('...' indicates
continuation and is not intended to be in the data base). Each data block has a name, which
may be atomic (but does not have to be). Let the atom US-EAST be the name of the above
block.

A QLISP-like notation will be used, with angle brackets <...> for tuples = lists, curly brackets {...}
for sets, and square brackets [...] for free property-lists. A property-list [i1 v1 i2 v2 ...] is a set of
assignments of vk to ik, so the square bracket expression is really an abbreviation for

    {<i1 v1>, <i2 v2>, ...}

LISP function definitions will be written with round parentheses (...). All these types of
parentheses are assumed to map into ordinary parentheses in the actual implementation. In
other words, the knowledge that a certain list represents a set rather than a tuple, is not assumed
to be available in that item itself.
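
The abbreviation can be spelled out mechanically. The Python sketch below is only an illustration; the function name plist_to_assignments and the sample data are assumptions, and Python tuples stand in for the angle-bracket pairs.

```python
def plist_to_assignments(plist):
    """Expand [i1, v1, i2, v2, ...] into the set {(i1, v1), (i2, v2), ...}."""
    if len(plist) % 2 != 0:
        raise ValueError("a property-list needs an even number of elements")
    return {(plist[k], plist[k + 1]) for k in range(0, len(plist), 2)}


# In storage, a tuple, a set and a property-list are all the same plain list;
# only a description kept elsewhere says which reading is intended.
stored = ["INSTATE", "MASS", "FULLNAME", "BOSTON"]
as_tuple = tuple(stored)                    # read as one 4-tuple
as_plist = plist_to_assignments(stored)     # read as two assignments
```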

It will be more convenient to specify the contents of blocks using the access function
dgetp[c, i, n], where c is a carrier, i an indicator, n a block name, and the function returns
the corresponding property-value. The block contents above can therefore be described as

    dgetp[BOSTON, INSTATE, US-EAST] = MASS
    dgetp[BOSTON, SUBURBS, US-EAST] = {LEXINGTON, REVERE, ...}
    ...
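
As a minimal sketch of this access function, assuming that a block is stored literally as a set of triples (the dictionary named blocks and the return of None for a missing assignment are choices made for this Python illustration, not statements about the MACLISP system):

```python
# Blocks as sets of (carrier, indicator, property) triples; a hypothetical layout.
blocks = {
    "US-EAST": {
        ("BOSTON", "INSTATE", "MASS"),
        ("BOSTON", "SUBURBS", frozenset({"LEXINGTON", "REVERE"})),
        ("NYC", "INSTATE", "NY"),
        ("MASS", "FULLNAME", "MASSACHUSETTS"),
    },
}


def dgetp(carrier, indicator, block_name):
    """Return the property of carrier under indicator in the named block,
    or None when the block holds no such assignment."""
    for c, i, value in blocks[block_name]:
        if c == carrier and i == indicator:
            return value
    return None
```

The linear scan keeps the sketch close to the triple notation; an implementation on property-lists would index by carrier first.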

The description of a block in DABA consists of two parts. Consider a data block (of which
US-EAST is a toy example) and a program which uses the block as a data base for question
answering or some similar purpose. One could write down several different blocks, using the
same conventions, and the program would then presumably be able to use any of these blocks.
The description of representation shall contain a specification which is common to these blocks,
and which therefore encodes some of the conventions that are assumed by the program. By
contrast, the description of extent contains a catalogue of the content of each block, and other
information which is local to the block. There are several reasons for making such a
distinction: economy of storage for the shared part of the description is an obvious reason.
Also, the previously mentioned possibility of partially evaluating a utility or other
parameter-driven program with respect to the data base description is only worthwhile if the part of the
description that is being kept fixed can be factored out. (There are however also ways of
avoiding the distinction, in special cases when one does not want to make it).

The common denominator for the two descriptions is the sorts. In the present example, one
immediately recognizes different sorts of carriers: CITY, STATE, etc. The description of
extent for a block includes a catalogue of the carriers in each of the sorts, represented as:

    ngetp[US-EAST, NODES] =
        [CITY {BOSTON, NYC, ...} STATE {MASS, NY, ...} ...]

while the description of structure includes the information of what indicators are used by objects
in each sort, for example that objects of type CITY may carry properties under the indicators
INSTATE, SUBURBS, etc.

The function ngetp is used for getting properties of block names, in the description of the
block's extent. The function may sometimes simply make an access in the property-list of its first
argument, in which case it is synonymous to the INTERLISP getp, but it may also compute its
value by default from an appropriately stored procedure, handle non-atomic block names, etc.

The description of extent also includes information about the location of the block, for example
'as global property-lists', 'as property-lists local to this block', or 'as text file with name ...'. The
first case is expressed as

    ngetp[US-EAST, ATLOC] = GLOBAL

The conventions used in the description of extent are to some degree arbitrary. One might
prefer to split up the NODES property so that the set of sorts is obtained in one access, and the
set of carriers in a sort is obtained in one access for each sort. Such changes would not be
significant.
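
The two ways of reading the catalogue can be sketched as follows. This is a Python illustration under assumptions: the dictionary named extent, the helper names sorts, carriers and sort_of, and the spelling of the location indicator (written ATLOC here) are not taken from the DABA implementation.

```python
# Hypothetical description of extent, keyed by block name.
extent = {
    "US-EAST": {
        "NODES": {"CITY": {"BOSTON", "NYC"}, "STATE": {"MASS", "NY"}},
        "ATLOC": "GLOBAL",
    },
}


def ngetp(block, prop):
    """Property of a block name; here a plain look-up, None if absent."""
    return extent.get(block, {}).get(prop)


def sorts(block):
    """The set of sorts, obtained in one access."""
    return set(ngetp(block, "NODES"))


def carriers(block, sort):
    """The carriers of one sort, one access per sort."""
    return ngetp(block, "NODES")[sort]


def sort_of(carrier, block):
    """The sort a carrier is catalogued under, or None if it is not listed."""
    for sort in sorts(block):
        if carrier in carriers(block, sort):
            return sort
    return None
```

Because callers go through sorts and carriers, the stored NODES property could be split up later without changing any utility that uses them.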

The meta-block of US-EAST (= its description of representation) is another block, whose name
might be CITIES. The relationship is indicated by

    getp[US-EAST, META] = CITIES

Some minimally needed information in the meta-block is, first, which indicators are carried by
objects in each sort in the described block. Thus, since BOSTON and NYC have properties under
the indicators INSTATE and SUBURBS, and since they are in the sort CITY, one should have:

    dgetp[CITY, CARRPROPS, CITIES] = {INSTATE, SUBURBS, ...}

and likewise

    dgetp[STATE, CARRPROPS, CITIES] = {HASCITIES, HASCAPITAL, ...}

and so on.

The meta-block should also contain information about the expected structure of properties. In our
example, we know that properties under the indicator INSTATE shall be atoms of the sort
STATE, that SUBURBS properties shall be sets of cities, and so on. Such conventions could be
encoded in a straightforward fashion as

    dgetp[INSTATE, PROPSTRUC, CITIES] = <SORT STATE>
    dgetp[SUBURBS, PROPSTRUC, CITIES] = <SET <SORT CITY>>
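
A checking utility can interpret such structure descriptors directly. The sketch below is a hedged Python illustration: descriptors are written as nested tuples, the catalogue named nodes and the function name satisfies are assumptions, and only the two descriptor kinds that appear in the example (SORT and SET) are handled.

```python
# Hypothetical catalogue of carriers per sort (from the description of extent).
nodes = {"CITY": {"BOSTON", "NYC", "LEXINGTON", "REVERE"},
         "STATE": {"MASS", "NY"}}

# Structure descriptors, mirroring <SORT STATE> and <SET <SORT CITY>>.
propstruc = {"INSTATE": ("SORT", "STATE"),
             "SUBURBS": ("SET", ("SORT", "CITY"))}


def satisfies(value, struc):
    """True if value conforms to a structure descriptor."""
    kind, arg = struc
    if kind == "SORT":
        # an atom (here: a string) that is catalogued under the given sort
        return isinstance(value, str) and value in nodes[arg]
    if kind == "SET":
        # a set whose every element conforms to the inner descriptor
        return isinstance(value, (set, frozenset)) and all(
            satisfies(v, arg) for v in value)
    raise ValueError(f"unknown structure descriptor {kind!r}")
```

The same descriptors could drive a presentation utility, for instance to decide when to print curly brackets.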



In our simple example, all names (block names, carriers, indicators, sort names, etc.) have been
atomic. That is however not necessary, and in descriptions of less trivial representations it is
frequently useful to let them be non-atomic.

The meta-block contains information which might occur as declarations in some other
programming languages, and in the data description language of a management system for
large data bases. The important difference is that here the meta-description is a new data
block, so that the user can use and extend this information according to his own needs. For
example, it would be natural to extend the meta-block with information which relates the
primitives for this data block (sorts and indicators, in this simple example) to user-oriented
concepts in a model of the intended application.

In the actual system, each block may be associated with a number of 'satellite' blocks which
provide additional but optional information. User additions to a meta-block are usually best
organized as a new satellite block, rather than as a change in the original meta-block. Even the
PROPSTRUC property is actually kept in such a satellite block.

Very often one wants to define access procedures for properties, which compute the property
from other data in the system, look up default values, store properties in alternative locations,
etcetera. The meta-block therefore always contains an access function for each indicator, for
example as:

    dgetp[INSTATE, ACCESSFN, CITIES] = XGETP

where xgetp is the default access function which does a trivial (explicit) look-up. Suppose
however that one would want to define a block US-EAST2 as an update of US-EAST, so that
properties in US-EAST2 use properties in US-EAST as default. The block US-EAST2 would be
described similarly to US-EAST, with the following amendments:

(1) ngetp[US-EAST2, MODIFOF] = US-EAST. This property assignment belongs to the
description of extent of US-EAST2.

(2) getp[US-EAST2, META] = CITMOD. US-EAST2 needs a different description of structure.
(In practice its meta-block would have a non-atomic name, but we assume an atomic name here
for simplicity).

(3) dgetp[INSTATE, ACCESSFN, CITMOD] =
        (LAMBDA (C I N) (OR (XGETP C I N)
                            (DGETP C I (NGETP N MODIFOF))))

and similarly for every other indicator that was assigned an access function in the old
meta-block CITIES. This access function takes the same arguments as the function dgetp. It first
checks if the property exists explicitly in the block that is mentioned as third argument, and
otherwise looks it up in the default block. (In the actual system, access functions have a fourth
argument, and can be used for 'get', 'put', 'delete', and 'change' operations).
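
The interplay between dgetp, the meta-block and the access functions can be traced in a short Python sketch. It is an illustration under assumptions: the dictionaries explicit, extent, meta and accessfn are invented storage layouts, only the 'get' operation is modelled, and Python's `or` plays the role of the LISP OR (so a stored falsy value such as an empty string would also fall through to the default).

```python
# Hypothetical storage: explicit[block][(carrier, indicator)] -> value.
explicit = {
    "US-EAST": {("BOSTON", "INSTATE"): "MASS", ("NYC", "INSTATE"): "NY"},
    "US-EAST2": {("NYC", "INSTATE"): "NEW-YORK"},
}
extent = {"US-EAST2": {"MODIFOF": "US-EAST"}}       # description of extent
meta = {"US-EAST": "CITIES", "US-EAST2": "CITMOD"}  # getp[block, META]


def xgetp(c, i, n):
    """Default access function: explicit look-up only."""
    return explicit[n].get((c, i))


def ngetp(n, prop):
    """Property of a block name, taken from its description of extent."""
    return extent.get(n, {}).get(prop)


def with_default(c, i, n):
    """Counterpart of
    (LAMBDA (C I N) (OR (XGETP C I N) (DGETP C I (NGETP N MODIFOF))))."""
    return xgetp(c, i, n) or dgetp(c, i, ngetp(n, "MODIFOF"))


# Access functions stored per meta-block and indicator.
accessfn = {"CITIES": {"INSTATE": xgetp}, "CITMOD": {"INSTATE": with_default}}


def dgetp(c, i, n):
    """Find the access function in the block's meta-block and apply it."""
    return accessfn[meta[n]][i](c, i, n)
```

Note that dgetp itself contains no knowledge of defaults; the behaviour of US-EAST2 comes entirely from what is stored under CITMOD.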

The block CITIES, which is the meta-block of US-EAST, should also in its turn have a
meta-block and a catalogue (description of extent). The sorts in the block CITIES are SORT
(containing the carriers CITY, STATE, etc.) and INDICATOR (containing the carriers INSTATE,
SUBURBS, HASCITIES, etc.). This structure is correctly described if we have

    getp[CITIES, META] = OMEGA
    dgetp[SORT, CARRPROPS, OMEGA] = {CARRPROPS}
    dgetp[INDICATOR, CARRPROPS, OMEGA] = {ACCESSFN, PROPSTRUC}

plus the appropriate properties on ACCESSFN and PROPSTRUC. It is then correct to define

    getp[OMEGA, META] = OMEGA

so that OMEGA describes itself. In general, proceeding from blocks to their meta-blocks, one
always eventually reaches OMEGA, but often the path is longer than in this example. - The
definition of the NODES properties for CITIES and OMEGA are straightforward.
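
The terminating regress from a block through its meta-blocks can be followed mechanically. A minimal Python sketch, assuming the META links are held in a dictionary named meta (an invented layout) and that the function name meta_chain is hypothetical:

```python
# META links for the example: each block name maps to its meta-block.
meta = {"US-EAST": "CITIES", "CITIES": "OMEGA", "OMEGA": "OMEGA"}


def meta_chain(block):
    """Follow META links from block until a self-describing block is reached."""
    chain = [block]
    while meta[chain[-1]] != chain[-1]:
        nxt = meta[chain[-1]]
        if nxt in chain:
            # a loop through several blocks is not a self-description
            raise ValueError("META links form a cycle of length > 1")
        chain.append(nxt)
    return chain
```

For the example above the path from US-EAST has three elements and ends in OMEGA; a utility that needs the description of a description simply takes one more step along the chain.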

What has been described so far is a basic description system, which might be sufficient for data
blocks that use simple representations. In an environment where the user has already designed
his primary program and his data base, he has to set up the description of representation as a
post factum description of the conventions he has made. If he needs non-trivial access
functions, he has to write them himself, although with skill and luck he may be able to define
them as small interface procedures that call appropriate parts of his primary program.
Similarly, the NODES property (= the catalogue) in the description of extent can sometimes be
computed when needed, from information that has already been set up by or for the primary
program, and otherwise the user has to create it.

Such a basic description is what is needed by utility programs, as discussed earlier. The intended
purpose of the DABA system is partly to provide a coordinated set of such utility programs, and
partly to provide 'canned' higher-level descriptions. For example, in specifying the block
CITMOD in the last example above, the user should only have to specify that it modifies
CITIES (expressed by an appropriate property assignment to the atom CITMOD), and that the
meta-block of CITMOD is e.g. MODIF, where MODIF would be a meta-meta-block which
imposes the appropriate defaults for access functions, NODES properties, etc. in CITMOD.
Similar meta-meta-blocks are or should be available for other common operations inside and
between data blocks.



3. Program/data base integration.

The DABA system is not particularly helpful for developing conventional programs. It is
however believed to be useful when one uses an often used, but little recognized programming
technique that I call data-driven programming. In this section I argue that data-driven
programming is a significant development, and much more than a hack; and also that a
DABA-type system can facilitate the use of this method.

A common model for a program in LISP (and most other languages) is that the program is a set
of procedures which call each other. Each procedure has a name. A call from a procedure
FOO to a procedure FIE is manifested in that the definition of FOO explicitly mentions the
name FIE. Such a textbook model of programs is not always applicable. Many programs are
organized as a collection of procedures each of which is attached to data items in a data base,
plus perhaps one part which is an ordinary program. In such a program, a procedure f may
sometimes process its input data by calling procedures which are attached to them in the data
base. This constitutes an indirect or data-driven call from the procedure f to a procedure g.
Usually the procedures or program fragments are stored as properties of atoms, but they may
appear anywhere in the data base.
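
A data-driven call can be shown with a very small evaluator. The following Python sketch is an illustration only: the dictionary named database, the indicator EVAL-FN and the operator atoms are assumptions, and Python tuples stand in for LISP lists.

```python
# Hypothetical data base: property-lists of atoms, as nested dictionaries.
database = {
    "PLUS":  {"EVAL-FN": lambda args: sum(args)},
    "TIMES": {"EVAL-FN": lambda args: args[0] * args[1]},
}


def evaluate(expr):
    """Ordinary procedure f: it never names the procedures it ends up calling."""
    if not isinstance(expr, tuple):
        return expr                          # a number evaluates to itself
    operator, *operands = expr
    g = database[operator]["EVAL-FN"]        # data-driven call from f to g
    return g([evaluate(x) for x in operands])


# New behaviour is added by extending the data base, not by editing evaluate.
database["MINUS"] = {"EVAL-FN": lambda args: args[0] - args[1]}
```

The definition of evaluate mentions no operator by name, which is exactly the situation that the textbook model of procedures calling each other by name does not cover.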

A data-driven program then consists of some 'ordinary' procedures, and some 'data-driven'
procedures which are invoked through data-driven calls. In most programming languages it is
difficult or impossible to implement data-driven programs, except of the very restricted kind
that are obtained in case statements where the driving data are integers (Fortran, Algol 68) or a
set of items that have been explicitly declared in the program (Pascal). It is easy and
straightforward to implement data-driven programs with full generality in interpreted LISP, but
this programming practice is not fully recognized: INTERLISP's makefile system [Teitelman,
1974] provides a lot of service in keeping track of compiled code, but assumes that it is stored in
the 'function cell' of the atom. In MACLISP [Moon, 1974] the compiler has only very recently
been provided with an option that allows it to compile functions that are not EXPR or FEXPR
properties. One should not treat data-driven programming as a 'trick', thereby implying that it
should be discouraged, or that it lacks research interest. It is a powerful programming method
and program structuring method for the following reasons.

-- Procedures obtain truly meaningful names. In data-driven programs, each procedure is
identified not by a single name, but by a combination of such. For example, procedures that
are stored directly on property-lists are identified by pairs of atoms. Therefore, the identifier of
a procedure can be more than 'mnemonic': it can state the purpose of the procedure in a
fashion which can be used by other parts of the program.

For example, McDonald's bibliography program [McDonald, 1975] assumes that each
bibliography entry is associated with a number of properties such as AUTHOR, TITLE, etc., and
each property name has on its property-list procedures for reading, printing, etc. that property.
The procedure that is identified as getp[AUTHOR,PRINT-UP-FN] has such a
better-than-mnemonic name. The routine which goes through all desired properties of an item
and applies the reading procedures of each indicator, uses the meaning embodied in the 'name'.

-- Facilitates automatic program generation. If program generation is to go beyond the level of
toy programs such as trivial sort routines, the generator must use a model of the program that is
being generated. The task of specifying the model, and even more the task of relating the
model to the program, are particularly simple for data-driven programs. The actual program
generation can then often consist of generating individual data-driven procedures or code
fragments. The latter case arises if each data-driven procedure has the form

     (LAMBDA (X ...) (FOO (code 1) (code 2) ... (code n)))

where each expression (code i) has been generated separately, and where the function FOO is
the 'glue' between the programs and is responsible for communication between them. (FOO
may be a built-in function such as OR or PROGN, or a function written for the purpose). The
PCDB system [Sandewall 1971, Sandewall 1973, Haraldson 1974] uses this method for program
generation.

-- Uses the application language and makes it easily extensible. The notation that is input to a
program is or should be a language which is natural to the application of the program. The
same holds for the notational conventions that are used in a data base. In both cases, a
program which is organized around such an application-oriented notation is likely to have a
good structure, and extensions to the program immediately reflect extensions to the application
'language'.

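The generation scheme in the second reason above, where separately generated fragments are
joined by a 'glue' function, can be sketched as follows. This is a Python illustration under
stated assumptions, not the PCDB method: the names generate, glue_or, glue_progn and
make_range_check are hypothetical, and fragments are modelled as one-argument functions.

```python
def glue_or(thunks):
    """Like LISP OR: evaluate in order, return the first non-false value."""
    for thunk in thunks:
        value = thunk()
        if value:
            return value
    return None

def glue_progn(thunks):
    """Like LISP PROGN: evaluate everything, return the last value."""
    value = None
    for thunk in thunks:
        value = thunk()
    return value

def generate(glue, fragments):
    """Build the analogue of (LAMBDA (X) (FOO (code 1) ... (code n)))."""
    def procedure(x):
        # Each fragment is wrapped in a thunk so the glue decides what runs.
        return glue([lambda frag=frag: frag(x) for frag in fragments])
    return procedure

def make_range_check(low, high, label):
    """A separately generated code fragment (hypothetical example)."""
    return lambda x: label if low <= x < high else None

classify = generate(glue_or, [make_range_check(0, 10, "SMALL"),
                              make_range_check(10, 100, "MEDIUM")])
print(classify(5), classify(50), classify(500))
```

Adding a further case then means generating one more fragment; neither the glue nor the
existing fragments have to change.
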
Interpreters are a classical example of data-driven programs. Interpreters for conventional
languages are data-driven with respect to procedure names (i.e. data of the interpreter). Recent
language features such as pattern-directed invocation and demons also assume that procedures
are indexed from data structures, although in this case the data of the interpreted program.
The apparent power of the latest generation of A.I. languages [Bobrow and Raphael, 1974] is
perhaps largely due to the fact that they made data-driven programming available to users who
did not think of using it explicitly. The claim here is that it is often better for the user to
develop his own scheme for organizing his program (in the sense of storing the procedures in
the right places) instead of using a single package of high-level devices. There are also several
examples of successful data-driven programming around. The SHRDLU program [Winograd,
1972] can be used in support of many claims: it is also data-driven in several parts.

The reason why this whole discussion is brought up in a paper about data base description, is
that data-driven programs allow procedures to appear in arbitrary positions in the data base.
There are often plenty of relationships between procedure items and other items in the data
base: procedures may have been generated from other data, and program analysis programs
may often generate data about programs that should be stored in the data base, so that it does
not have to be re-generated repeatedly. It is then natural to use one's data base management
system for managing programs and program descriptions as well.

The contrast between the situation described here, and the situation described in the previous
section is characterized by figure 1. In diagram (a), the large triangle is the primary program,
the small triangle the utility programs, and the data base is described by the DABA system for
the utility programs. In diagram (b), the primary program consists mainly of data-driven
procedures which are also managed by DABA.



FIGURE 1
There is also another reason: the data that data-driven procedures are associated with, can
sometimes be 'object' data for the system, but very often it is natural to choose them as items
that appear in the self-description of the data block, for example indicators or sort names.
Thus the descriptions of a data block are often an appropriate framework for organizing the
program.

Most utility programs can with advantage be data-driven. For example, a presentation utility
could be driven by printing procedures associated with property indicators. This is a
commonplace idea, but raises some practical problems. Consider the following scenario: we
have acquired a large data base (large by A.I. standards, that is), consisting of several blocks
with different structures. We are also using a number of different utilities, each of which
drives specialized procedures for all or some of the blocks. Furthermore, descriptions of the
data blocks sit around and are directly interpreted by several of the utilities, and are used for
generating specialized procedures for some others. Suppose now that we want to move this
battleship a bit, for example: (a) modify the structure of some data block, (b) delete a data
block, (c) discard a utility. The first operation implies a number of other changes in the system;
the other two enable non-trivial garbage collections. In a large system with a considerable
life-length, such garbage collections are necessary (even if one has infinite memory, he still
wants to know what is garbage so it does not have to be updated).

In order to support such simple operations, and also in order to support the user who wants
to understand the system, so that he can perform more complex operations on it, one needs a
model of the structure of the system. Here again the block structure and other concepts in
DABA are useful.



Let us exemplify that, again with a simple example. Consider a pretty-printing utility program
P, which operates on a data block B whose meta-block getp[B,META] = M. The program P
makes use of specialized printing procedures and other parameters which apply to all blocks
which like B have the structure described in M. These parameters together constitute a data
block MP. (They might be included in M itself, but it is not desirable to clobber M with
auxiliary procedures for all utilities, and therefore we prefer to let each utility define its own
'satellite' block to M).

The block MP has the same sorts and the same catalogue as M, but uses different CARRPROPS
assignments. For example, in our initial geography example the block M contained PROPSTRUC
and ACCESSFN properties for HASCITIES and other indicators used in B. The block MP would
contain a PRINTFN property for HASCITIES, which P then uses. The relationship between MP
and M should be expressed by a reference such as

     getp[MP,DESCRIBES] = M

This reference should imply a default value for the NODES property of MP.

The meta-block for MP must be a block which describes the structure of the parameters that the
program P assumes, i.e. it is part of the documentation of P. In the present DABA system,
utility programs are integrated with their specification, so a data block P contains both the set
of procedures that make up the utility program, and the information that makes P a suitable
meta-block for MP. For example, P contains a reference to the knowledge about how to compute
the NODES property of MP from its DESCRIBES property. (Actually, that knowledge is conveyed
to P by its meta-block). The structure of these blocks is illustrated in figure 2.



FIGURE 2 
This example illustrates how blocks of data are not merely clusters with dense internal
connections: there are also relationships between blocks, such as the META relation, the
DESCRIBES relation, the MODIFOF relation (used in an example in the previous section).
Several other relations are important, such as the relations between a program, the block of data
that was input to it, and the block of data that it produced as result. Relations between blocks
are macro-level descriptions, which complement the micro-level, declaration-type descriptions
such as CARRPROPS or PROPSTRUC property.
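
The block relationships in the pretty-printing example can be sketched as follows, in Python
for illustration only. The dictionary layout, the helper names nodes_of, satellite_for and
pretty_print, and the sample NODES value are assumptions; the sketch only shows how a
DESCRIBES reference can supply a default NODES value and how a utility P can find its
satellite block MP.

```python
# Inter-block relations (META, DESCRIBES) stored per block name.
blocks = {
    "M":  {"NODES": ["HASCITIES", "INSTATE"]},   # sample value, assumed
    "B":  {"META": "M"},
    "P":  {},
    "MP": {"META": "P", "DESCRIBES": "M"},       # satellite block of M for utility P
}

# Per-block property assignments for indicators (a stand-in for CARRPROPS).
carrier_props = {
    ("M", "HASCITIES"):  {"PROPSTRUC": "list of cities"},
    ("MP", "HASCITIES"): {"PRINTFN": lambda cities: ", ".join(cities)},
}

def nodes_of(name):
    """NODES of a block; defaults to the NODES of the block it DESCRIBES."""
    block = blocks[name]
    if "NODES" in block:
        return block["NODES"]
    if "DESCRIBES" in block:
        return nodes_of(block["DESCRIBES"])
    return []

def satellite_for(utility, meta_block):
    """Find the block whose META is the utility and which DESCRIBES meta_block."""
    for name, block in blocks.items():
        if block.get("META") == utility and block.get("DESCRIBES") == meta_block:
            return name
    return None

def pretty_print(block_name, indicator, value):
    """Utility P: print a value using the PRINTFN stored in the satellite block."""
    satellite = satellite_for("P", blocks[block_name]["META"])
    printfn = carrier_props[(satellite, indicator)]["PRINTFN"]
    return f"{indicator}: {printfn(value)}"

print(pretty_print("B", "HASCITIES", ["BOSTON", "SALEM"]))
```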



4. Generation of procedures.

The DABA system as such assumes that data base descriptions contain procedures, namely
access functions, and specialized procedures for various utility programs. In addition, many
applications may involve data-driven programs as discussed in the previous section.

Where do these procedures come from? The simplest case is of course where they are always
written by the user. There are however several ways whereby the user can be relieved of this
responsibility, or at least of some of the drudgery involved.

One obvious method is by default computation. If the procedure does not exist, then it is
computed by a procedure which may derive it from other data, ask the user, etc. This is
accomplished in a simple and uniform fashion in DABA through a recursive access-function
mechanism. The function dgetp which was used in section 2 to obtain data from the block
USCITIES, is defined approximately as

     dgetp[c,i,n] =
          if n = OMEGA then getp[c,i]
          else apply[dgetp[i,ACCESSFN,getp[n,META]], list[c,i,n]]

In other words, in order to dgetp the HASCITIES property of MASS, one retrieves and uses the
ACCESSFN property of HASCITIES in the meta-block. But for that, he must retrieve the
ACCESSFN property of ACCESSFN in the meta-meta-block, and so on. (At least theoretically; the
recursion is sometimes shortcut). The recursion terminates at the ultimate 'meta' block OMEGA.
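
The recursion can be traced with a small Python transcription of the definition. This is a
sketch under assumptions rather than DABA's code: property lists are modelled as a global
dictionary, the helper putp and the sample data are hypothetical, and an explicit error is
raised when a block has no META so that the recursion cannot run away.

```python
OMEGA = "OMEGA"
plist = {}  # atom -> {indicator: value}, a stand-in for LISP property lists

def putp(atom, indicator, value):
    plist.setdefault(atom, {})[indicator] = value

def getp(atom, indicator):
    return plist.get(atom, {}).get(indicator)

def dgetp(c, i, n):
    """Get property i of carrier c in block n via the access function in n's meta-block."""
    if n == OMEGA:
        return getp(c, i)
    meta = getp(n, "META")
    if meta is None:
        raise LookupError(f"block {n!r} has no META property")
    access_fn = dgetp(i, "ACCESSFN", meta)
    return access_fn(c, i, n)

# Block chain: US-EAST has meta-block CITIES, whose meta-block is OMEGA.
putp("US-EAST", "META", "CITIES")
putp("CITIES", "META", OMEGA)

# At the OMEGA level, the access function for access functions is a plain lookup.
putp("ACCESSFN", "ACCESSFN", lambda c, i, n: getp(c, i))
# The access function for HASCITIES supplies an empty list as default value.
putp("HASCITIES", "ACCESSFN", lambda c, i, n: getp(c, i) or [])

putp("MASS", "HASCITIES", ["BOSTON", "WORCESTER"])

print(dgetp("MASS", "HASCITIES", "US-EAST"))
print(dgetp("VERMONT", "HASCITIES", "US-EAST"))
```

The second call shows the default mechanism: nothing is stored for VERMONT, so the value comes
from the access function rather than from the property list.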

This mechanism is a flexible way of defining appropriate access functions. For example, in
section 2 we discussed the modified data block US-EAST2, which modified the block US-EAST,
and where

     getp[US-EAST,META] = CITIES
     getp[US-EAST2,META] = CITMOD
     getp[US-EAST2,MODIFOF] = US-EAST

Here the user should not have to write out the access functions for CITMOD. Instead, there
should be a data block MOD which describes modification blocks in general, so that
getp[CITMOD,META] = MOD. The access functions in CITMOD are obtained as
dgetp[ACCESSFN,ACCESSFN,MOD], and might be the one outlined in section 2, or (improved)
the following: go and get the access function for the same indicator in the block CITIES. Try
using it in the current block (in this case, US-EAST2). If no result, then make an access in
dgetp[current block, MODIFOF].
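
The improved access function for modification blocks can be sketched as follows, in Python
for illustration. The per-block store, the helper names and the sample data are assumptions;
in particular, the sketch simply reuses the base access function in the modified block, which
simplifies the full dgetp recursion.

```python
store = {}  # (block, carrier, indicator) -> value; hypothetical per-block storage
relations = {"US-EAST2": {"MODIFOF": "US-EAST"}}

def put(block, carrier, indicator, value):
    store[(block, carrier, indicator)] = value

def plain_access(carrier, indicator, block):
    """Stand-in for the access function of the same indicator in CITIES."""
    return store.get((block, carrier, indicator))

def modification_access(base_access):
    """Wrap a base access function so that misses fall back to the modified block."""
    def access(carrier, indicator, block):
        result = base_access(carrier, indicator, block)
        if result is None:
            modified = relations[block]["MODIFOF"]
            return base_access(carrier, indicator, modified)
        return result
    return access

put("US-EAST", "MASS", "HASCITIES", ["BOSTON"])
put("US-EAST", "NY", "HASCITIES", ["ALBANY"])
put("US-EAST2", "NY", "HASCITIES", ["ALBANY", "BUFFALO"])  # the modification

access = modification_access(plain_access)
print(access("NY", "HASCITIES", "US-EAST2"))
print(access("MASS", "HASCITIES", "US-EAST2"))
```

Only the differences have to be stored in US-EAST2; everything else is found in US-EAST.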

In fact, all other system properties, for example CARRPROPS, are accessed in the same way
using dgetp. It is therefore not necessary to invent a new atom as a name for CITMOD. Its name
is chosen as [MOD CITIES], whereby it is implicitly specified to be a block whose meta is MOD
and which modifies the block CITIES. (The actual DABA notation is slightly different). In
general, the method of defining properties of blocks through access functions in the meta-block
complements well the method of using non-atomic ('molecular') names for blocks, where the
contents of the block, or at least some of the contents, are implicit in the name of the block.
The advantages with non-atomic block names are analogous to the naming advantages of
data-driven programs.

Utility programs which use specified parametric procedures also access them with the function
dgetp, which means that the same kinds of default mechanisms can be used for their
parametric procedures such as PRINTFN. The recursive access mechanism is quite powerful, and
enables one to implement a number of desirable facilities with a very small kernel system. Its
major drawback is that higher-order access functions of access functions are usually less than
transparent to read and understand. Efficiency may also be a problem, which hopefully can be
handled by saving access functions so they only have to be computed once ('memoization'), and
using automatic simplification of the lower levels of access functions.

Sometimes a procedure is to be built up and modified in several successive steps. It must then be
initialized in some way (for example by its meta-level access function), whereupon it can receive
advise which successively modifies it. For example, if a data-driven procedure contains, or
refers to a set of theorems or demons that are to be triggered by the indexing data, then each
advise might contribute one more theorem or demon to the structure. A program for
simplifying LISP expressions might associate with each LISP function a simplifier procedure
for forms where it is the leading function. A new simplification rule, such as

     (CAR (LIST X Y)) -> X

would then be sent as a message to the simplifier for CAR. (The REDFUN program [Sandewall
1971, Beckman et al. 1974] works in this fashion). Sometimes the advise that is given to a
procedure is less uniform. INTERLISP [Teitelman 1974] contains facilities for user-specified
advise to the entry and exit parts of arbitrary user procedures. In the DABA system, it is
frequently desirable to let various items send advise to an access function or class of access
functions, telling it where to find explicit and default data, whether and how to 'memoize'
computed data, and so on.
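
The simplifier example can be sketched as follows. This is an illustrative Python sketch and
not the REDFUN program: expressions are modelled as nested tuples, rules as functions that
return either a rewritten expression or None, and the names send_advise, simplify and
car_of_list are hypothetical.

```python
simplifier_rules = {}  # leading function name -> list of rules received so far

def send_advise(function_name, rule):
    """Deliver one more rule to the simplifier attached to function_name."""
    simplifier_rules.setdefault(function_name, []).append(rule)

def simplify(expr):
    """Simplify arguments first, then try the rules of the leading function."""
    if not isinstance(expr, tuple) or not expr:
        return expr
    head = expr[0]
    expr = (head,) + tuple(simplify(arg) for arg in expr[1:])
    for rule in simplifier_rules.get(head, []):
        result = rule(expr)
        if result is not None:
            return simplify(result)
    return expr

def car_of_list(expr):
    """The rule (CAR (LIST X Y)) -> X, for any non-empty LIST form."""
    if len(expr) == 2:
        arg = expr[1]
        if isinstance(arg, tuple) and len(arg) >= 2 and arg[0] == "LIST":
            return arg[1]
    return None

send_advise("CAR", car_of_list)
print(simplify(("CAR", ("LIST", "A", "B"))))
```

The simplifier for CAR starts out empty and is built up one message at a time; forms with other
leading functions are left unchanged until their own simplifiers receive rules.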

Several of Hewitt's actor ideas [Greif and Hewitt, 1974] carry over to this purpose. What we
have called advise is a kind of message. Giving advise is like an actor 'handshake': the
receiver of the message must be the one who knows how to incorporate it into his internal
structure. There is a need for actors in the sense of objects which both receive and send
messages. But chains of messages which trigger each other are here only a secondary purpose:
the primary purpose of a message is to modify a procedure or other data item. Also, it is
mandatory in our case to have an option for saving a protocol of which messages were sent
where, so that later changes early in a message chain can propagate along the chain. Such a
protocol should of course be stored as another data block, in line with the general philosophy of
the system.

In fact, the data structure becomes cleaner if messages are sent to blocks, rather than to
individual procedures. Thus a simple meta-block might receive messages saying 'tell all your
access functions to pick default values from the block B' or 'tell your access function for
SUBURBS to get the explicit value, and filter away from it all proposed suburbs which are not in
the same state'. The structure of the blocks involved should be as follows:

     B      block to which messages are sent
     M      getp[B,META]
     BA     block of messages that have been sent to B
     A      getp[BA,META]

Here A should contain the information about how to interpret incoming messages, and it should
be on the same level as M. In other words, if B and B' have the same M, and BA' is like BA but
for B', then BA and BA' should be able to share the same meta-block A (figure 3).
[Figure 3 legend: one kind of arrow points from a block to its meta-block; the other points
from a message-history block to the block that sent and received the messages.]

FIGURE 3



The previously mentioned methods for defining blocks through their meta-blocks, sometimes
using non-atomic block names, are useful for protocol blocks as well. Protocol blocks are
actually given non-atomic names such as [PROTOCOL B] rather than BA. The META of this
block is then implicitly PROTOCOL. If M has an idiosyncratic structure, then a corresponding,
tailored meta-block for protocols will be needed, i.e. PROTOCOL is really [PROTOCOL* M], and
what we used to call BA has a name of the form [[PROTOCOL* M] B]. The meta-meta-block
PROTOCOL* computes parts of the protocol meta-block [PROTOCOL* M] from M.

Protocol meta-blocks such as PROTOCOL or [PROTOCOL* M] also contain the decoders for
incoming messages. Each transmission of a message is called an event. An event knows what
message it conveys, its source, and its destination (where both of the latter are block names).
The event is also a member of the protocol blocks of its source and its destination, although
with different properties in those two. A simple executive keeps track of a queue of events, and
for each event looks up the decoder in the protocol block of the destination block of that event,
and applies it.
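
Such an executive can be sketched as follows, in Python for illustration. The Event class, the
SENT/RECEIVED tags, the message format and the sample decoder are assumptions made for the
example; the message-sending facility itself is described in this paper only as a
specification.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Event:
    message: tuple      # what the event conveys, e.g. ("DEFAULTS-FROM", "B0")
    source: str         # name of the sending block
    destination: str    # name of the receiving block

blocks = {"B": {"DEFAULTS-FROM": None}}  # state that messages may modify
protocols = {}   # block name -> events recorded for it (the protocol block)
decoders = {}    # destination block name -> decoder for incoming messages

def decode_for_b(event):
    """Hypothetical decoder: interprets a message by updating the destination block."""
    kind, value = event.message
    if kind == "DEFAULTS-FROM":
        blocks[event.destination]["DEFAULTS-FROM"] = value

decoders["B"] = decode_for_b

def run(queue):
    """Take events off the queue, record them in both protocols, and decode them."""
    while queue:
        event = queue.popleft()
        protocols.setdefault(event.source, []).append(("SENT", event))
        protocols.setdefault(event.destination, []).append(("RECEIVED", event))
        decoders[event.destination](event)

queue = deque([Event(("DEFAULTS-FROM", "B0"), source="USER", destination="B")])
run(queue)
print(blocks["B"], len(protocols["B"]))
```

The event is recorded under both its source and its destination, with a different tag in each,
which mirrors the remark that it has different properties in the two protocol blocks.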

This message-sending facility is not intended as some kind of programming system. If the
DABA system is used as in section 2 of this paper, then it does not affect the user's primary
program at all. It is intended as a mechanism for performing and keeping track of updates to
the data base (including data-driven procedures), so that later changes in the data base (in the
separate-identity sense of the word) obtain appropriate secondary effects. Also, messages are
only sent 'to procedures' (loosely speaking) for changing them, not for invoking them.



5. Other aspects.

Some aspects of the DABA system have been more or less ignored in this paper. We have
remarked that each block needs a description of structure, and a description of extent. The
description of structure is the meta-block, and its meta-block, and so on. The description of
extent, or 'catalogue', is at least in simple cases the set of properties of the block name. But it
also needs a meta-block, where for example the access function for the name's NODES property
is located (in the case where the NODES property is computed from other information). The
meta-block of a block B, and the meta-block of the catalogue of B are not in general identical,
but the latter is derivable from the former. Also, the catalogue block of the catalogue block is
computed as needed (storing it explicitly would lead to an infinite regress). The resulting
structure is powerful, but unfortunately also tends to become fairly complex. Later generations
of the system will attempt to simplify it.

Another aspect which has not been covered is the relationship between the description structure
of DABA on one hand, and problems in the representation of knowledge, such as IS-A link
problems and frame systems [Minsky 1974] on the other hand. Information in DABA
meta-blocks such as CARRPROPS and ACCESSFN information corresponds vaguely to what one
needs in those cases, but the correspondence is not trivial.

DABA is presently a MACLISP program, although it should be relatively easy to transfer it to
other LISP dialects. It contains simple utilities for data entry, checking, dumping, and
presentation. The utilities are data-driven and their structure is described within the system, as
described above. The current system also contains facilities for keeping track of all blocks, and
a few general-purpose facilities such as comment blocks (for arbitrary other blocks) and update
blocks. The message-sending facility for update of procedures has been specified and is
probably the next to be implemented. After that, the present implementation will probably have
served its purpose, and the next generation of DABA will be due.